@pull pull bot commented Oct 31, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot locked and limited conversation to collaborators Oct 31, 2025
@pull pull bot added the ⤵️ pull label Oct 31, 2025
nico and others added 28 commits November 5, 2025 07:40
…ster (#164479)

Technically, it is possible that a callee-saved register is saved in
different locations. CFIInstrInserter should handle this, but currently
it does not.
Use cannotHoistOrSinkRecipe to forbid sinking allocas.
When ISL encounters an internal error, it sets the error flag, but it is
not isl_error_quota that was already checked. Check for general errors
and abort the schedule optimization if that happens, instead of
continuing on the good path.

The error occurred when compiling llvm-test-suite's
MultiSource/Applications/JM/lencod/leaky_bucket.c with Polly enabled.
Not adding a test case because it depends on ISL internals. We do not
want a test case to depend on which version of ISL is used.
Currently, the ARM backend incorrectly parses every `arm` prefixed arch
to be non-thumb, but `armv6m` is THUMB and doesn't have ARM ops, causing
the test to fail when compiling to assembly and not LLVM IR: `error:
Function 'foo' uses ARM instructions, but the target does not support
ARM mode execution.` This only happens when invoking cc1 directly and
not the Clang driver.

As a quick triage, this patch changes the tests to use `thumb`.

Uncovered by #151404
…160536)

As discussed in #153402, we have inefficiencies in handling constant pool
access that are difficult to address. Using an IR pass to promote double
constants to a global allows a higher degree of control of code
generation for these accesses, resulting in improved performance on
benchmarks that might otherwise have high register pressure due to
accessing constant pool values separately rather than via a common base.

Directly promoting double constants to separate global values and
relying on the global merger to do a sensible thing would be one
potential avenue to explore, but it is _not_ done in this version of the
patch because:
* The global merger pass needs fixes. For instance it claims to be a
function pass, yet all of the work is done in initialisation. This means
that attempts by backends to schedule it after a given module pass don't
actually work as expected.
* The heuristics used can impact codegen unexpectedly, so I worry that
tweaking it to get the behaviour desired for promoted constants may lead
to other issues. This may be completely tractable though.

Now that #159352 has landed, the impact in terms of dynamically executed
instructions is slightly smaller (as we are starting from a better
baseline), but still worthwhile in lbm and nab from SPEC. Results below
are for rva22u64:

```
Benchmark                  Baseline         This PR   Diff (%)
============================================================
500.perlbench_r         180668945687    180666122417     -0.00%
502.gcc_r               221274522161    221277565086      0.00%
505.mcf_r               134656204033    134656204066      0.00%
508.namd_r              217646645332    216699783858     -0.44%
510.parest_r            291731988950    291916190776      0.06%
511.povray_r             30983594866     31107718817      0.40%
519.lbm_r                91217999812     87405361395     -4.18%
520.omnetpp_r           137699867177    137674535853     -0.02%
523.xalancbmk_r         284730719514    284734023366      0.00%
525.x264_r              379107521547    379100250568     -0.00%
526.blender_r           659391437610    659447919505      0.01%
531.deepsjeng_r         350038121654    350038121656      0.00%
538.imagick_r           238568674979    238560772162     -0.00%
541.leela_r             405660852855    405654701346     -0.00%
544.nab_r               398215801848    391352111262     -1.72%
557.xz_r                129832192047    129832192055      0.00%

```

---
Notes for reviewers:
* As discussed at the sync-up meeting, the suggestion is to try to land
an incremental improvement to the status quo even if there is more work
to be done around the general issue of constant pool handling. We can
discuss here if that is actually the best next step or not, but I just
wanted to clarify that's why this is being posted with a somewhat narrow
scope.
* I've disabled transformations both for RV32 and on systems without D
as both cases saw some regressions.
Fixes #165346

This patch renames stale variable names where `TypeSourceInfo` objects
were still using the old `DI` (`DeclaratorInfo`) naming convention.

Specifically, variables of type `TypeSourceInfo` have been updated from
`DI` to `TSI` to improve code clarity and maintain consistency with the
current naming.
The debug info attached to the BUNDLE is that of the first instruction in
the BUNDLE, even if better debug info (line:column) is present in later
instructions of the bundle. The patch tries to pick the better debug info
first. If none is available, worse debug info without a line number is
chosen.

---------

Co-authored-by: Vladislav Dzhidzhoev <[email protected]>
Co-authored-by: Orlando Cazalet-Hyams <[email protected]>
We need to allow BitCasts between pointer types to different prim types,
but that means we need to catch the problem at a later stage, i.e. when
loading the values.

Fixes #158527
Fixes #163778
Move to the libcall impl based functions.
Update tests to contain auto generated checks.
PR #165993 accidentally broke the lowering of the `test.wait` Op.

This patch fixes the issue and adds tests to verify the lowering to
intrinsics for all mbarrier Ops, ensuring similar regressions are caught in the
future.

Additionally, the `cp-async-mbarrier` test is moved to the
`mbarriers.mlir` test file to keep all related tests together.

Signed-off-by: Durgadoss R <[email protected]>
#166536)

…e size is unknown

Keep _negative suffix only for test cases when the size is negative
… checks (#148810)

This PR adds support for the NOTIFY specifier in the image selector as
described in the 2023 standard, and adds checks for the NOTIFY_TYPE type.
…ional (#166032)

This picks up from #166028, making the `Function` argument optional:
most cases don't need to provide it, but in e.g. InstCombine's case,
where the instruction (select, branch) is not attached to a function
yet, the function needs to be passed explicitly.

Co-authored-by: Florian Hahn <[email protected]>
…166078)

In the following example, `Functor::method()` inappropriately triggers a
diagnostic that `outer()` is blocking by allocating memory.

```
void outer() [[clang::nonblocking]]
{
	struct Functor {
		int* ptr;
		
		void method() { ptr = new int; }
	};
}
```

---------

Co-authored-by: Doug Wyatt <[email protected]>
…5630)

When there's a deep inheritance hierarchy of multiple C++ classes (see
below), then the mangled name of a VFTable can include multiple key
nodes in the target name.
For example, in the following code, MSVC will generate mangled names for
the VFTables that have up to three key classes in the context.
<details><summary>Code</summary>

```cpp
class Base1 {
  virtual void a() {};
};
class Base2 {
  virtual void b() {}
};

class Ind1 : public Base1 {};
class Ind2 : public Base1 {};

class A : public Ind1, public Ind2 {};

class Ind3 : public A {};
class Ind4 : public A {};

class B : public Ind3, public Ind4 {};

class Ind5 : public B {};
class Ind6 : public B {};

class C : public Ind5, public Ind6 {};

int main() { auto i = new C; }
```
</details> 

This will include `??_7C@@6BInd1@@Ind4@@Ind5@@@` (and every other
combination). Microsoft's undname will demangle this to "const
C::\`vftable'{for \`Ind1's \`Ind4's \`Ind5'}". Previously, LLVM would
demangle this to "const C::\`vftable'{for \`Ind1'}".

With this PR, the output of LLVM's undname will be identical to
Microsoft's version. This changes `SpecialTableSymbolNode::TargetName`
to a node array which contains each key from the name. Unlike
namespaces, these keys are not in reverse order - they are in the same
order as in the mangled name.
Icohedron and others added 30 commits November 6, 2025 11:09
Fixes #145752 

This PR inverts the result of `firstbithigh` when targeting DirectX by
subtracting it from integer bitwidth - 1 to match the result from DXC.
The result is not inverted if `firstbithigh` returned -1 or when
targeting a backend other than DirectX.
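The inversion described above can be sketched in plain C++. This is only an illustration, not the code from the PR: the helper names are hypothetical, and it assumes the raw operation reports the position counted from the most significant bit, returning -1 when no bit is set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the raw operation: position of the highest set
// bit counted from the most significant bit (0 = MSB), or -1 for zero input.
int32_t rawFirstBitHigh(uint32_t value) {
  if (value == 0)
    return -1;
  int32_t position = 0;
  while ((value & 0x80000000u) == 0) {
    value <<= 1;
    ++position;
  }
  return position;
}

// Inversion as described in the text: subtract the raw result from
// (bit width - 1), but pass the -1 sentinel through unchanged.
int32_t invertFirstBitHigh(int32_t raw, int32_t bitWidth) {
  return raw == -1 ? -1 : (bitWidth - 1) - raw;
}
```

With these helpers, an input of `1` gives a raw result of 31 and an inverted result of 0, while an input of `0` stays at -1.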
…s still used outside of the block only

If the current node is a copyable node, its parent is copyable too, and
the current node is still only used outside of the block, it is better to
cancel scheduling for that node; otherwise a wrong def-use chain might be
built during vectorization.

Fixes #166775
The paper ensures that a static_assert operand cannot be deferred
until runtime; it must accept an integer constant expression which is
resolved at compile time.

Note, Clang extends what it considers to be a valid integer constant
expression, so this also verifies the expected extension diagnostics.
Introduce a common interface for operations with alignment attributes
across MemRef, Vector, and SPIRV dialects. The interface exposes
getMaybeAlign() to retrieve alignment as llvm::MaybeAlign.

This is the second part of the PRs addressing issue #155677.

Co-authored-by: Erick Ochoa Lopez <[email protected]>
This patch introduces a new way to reconstruct the thread stackframe
list.

New `SyntheticFrameProvider` classes can lazily fetch a StackFrame at a
given index using a provided StackFrameList.

The provided list can either be the real unwinder StackFrameList, or
SyntheticFrameProviders can be chained to each other.
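The chaining idea can be sketched roughly as follows. Every name here (`Frame`, `FrameSource`, `UnwinderFrames`, `SkipInnermostProvider`) is hypothetical and chosen for illustration; none of them is an LLDB class, and the real interfaces may look quite different. The sketch only shows a provider that fetches frames on demand from an input that is either the base list or another provider.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical frame type; not LLDB's StackFrame.
struct Frame {
  std::string function;
};

// Hypothetical common interface for anything that can hand out frames.
class FrameSource {
public:
  virtual ~FrameSource() = default;
  // Returns the frame at `index`, or std::nullopt past the end.
  virtual std::optional<Frame> frameAt(std::size_t index) = 0;
};

// Stand-in for the list produced by the real unwinder.
class UnwinderFrames : public FrameSource {
public:
  explicit UnwinderFrames(std::vector<Frame> frames)
      : frames_(std::move(frames)) {}
  std::optional<Frame> frameAt(std::size_t index) override {
    if (index >= frames_.size())
      return std::nullopt;
    return frames_[index];
  }

private:
  std::vector<Frame> frames_;
};

// A provider that hides the innermost frame of its input. It queries the
// input only when a frame is requested, so nothing is computed up front.
class SkipInnermostProvider : public FrameSource {
public:
  explicit SkipInnermostProvider(std::shared_ptr<FrameSource> input)
      : input_(std::move(input)) {}
  std::optional<Frame> frameAt(std::size_t index) override {
    return input_->frameAt(index + 1);
  }

private:
  std::shared_ptr<FrameSource> input_;
};
```

Because a provider takes any `FrameSource` as input, wrapping one `SkipInnermostProvider` in another hides two frames without either provider knowing about the other.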

This is the foundation work to implement ScriptedFrameProviders, which
will come in a follow-up patch.

Signed-off-by: Med Ismail Bennani <[email protected]>

…6676)

There may be valid reasons for not being able to find an SDK. Right now,
it's printed as an error, which is causing confusion for users that
interpret the error as something fatal, and not something that can be
ignored.

rdar://155346799
This teases the SFINAE handling bits out of the CodeSynthesisContext,
and moves that functionality into SFINAETrap and a new class.

There is also a small performance benefit here:
<img width="1460" height="20" alt="image"
src="https://github.com/user-attachments/assets/aeb446e3-04c3-418e-83de-c80904c83574"
/>
…166662)

This patch implements the base and python interface for the
ScriptedFrameProvider class.

This is necessary to call python APIs from the ScriptedFrameProvider
that will come in a follow-up.

Signed-off-by: Med Ismail Bennani <[email protected]>

…ct member functions (#165919)

Fixes #163731

---

This PR addresses false-positive shadow diagnostics for lambdas inside
explicit object member functions

```cpp
struct S {
  int x;
  void m(this S &self) {
    auto lambda = [](int x) { return x; }; // ok
  }
};
```
This patch enhances HexagonQFPOptimizer in multiple ways:

1. Refactor the code for better readability and maintainability.

2. Optimize vabs, vneg and vilog2 converts

   The three instructions mentioned can be optimized as shown below:

   ```
   v1.sf = v0.qf32
   v2.qf = vneg v1.sf
   ```

   to

   ```
   v2.qf = vneg v0.qf32
   ```

  This optimization eliminates one conversion and is applicable
  to both qf32 and qf16 types.

3. Enable vsub fusion with mixed arguments. Previously, QFPOptimizer did
not fuse partial qfloat operands with vsub. This update allows selective
use of vsub_hf_mix, vsub_sf_mix, vsub_qf16_mix, and vsub_qf32_mix when
appropriate. It also enables QFP simplifications involving vector pair
subregisters.

Example scenario in a machine basic block targeting Hexagon:

```
v1.qf32 = ... // result of a vadd
v2.sf   = v1.qf32
v3.qf32 = vmpy(v2.sf, v2.sf)
```

4. Remove redundant conversions Under certain conditions, we previously
bailed out before removing qf-to-sf/hf conversions. This patch removes
that bailout, enabling more aggressive elimination of unnecessary
conversions.

5. Don't optimize equals feeding into multiply: Removing converts
feeding into multiply loses precision. This patch avoids optimizing
multiplies by default and gives users a flag to enable the
optimization.

Patch By: Fateme Hosseini

Co-authored-by: Kaushik Kulkarni <[email protected]>
Co-authored-by: Santanu Das <[email protected]>
This is canonical in the rest of the repository and otherwise we can end
up with warnings when compiling with clang-cl on Windows that look like
the following:

```
2025-11-06T17:55:25.2412502Z C:\_work\llvm-project\llvm-project\llvm\include\llvm/Support/thread.h(37,5): warning: 'LLVM_ON_UNIX' is not defined, evaluates to 0 [-Wundef]
2025-11-06T17:55:25.2413436Z    37 | #if LLVM_ON_UNIX || _WIN32
2025-11-06T17:55:25.2413791Z       |     ^
2025-11-06T17:55:25.2414625Z C:\_work\llvm-project\llvm-project\llvm\include\llvm/Support/thread.h(52,5): warning: 'LLVM_ON_UNIX' is not defined, evaluates to 0 [-Wundef]
2025-11-06T17:55:25.2415585Z    52 | #if LLVM_ON_UNIX
2025-11-06T17:55:25.2415901Z       |     ^
2025-11-06T17:55:25.2416169Z 2 warnings generated.
```

Reviewers: joker-eph, pcc, cachemeifyoucan

Reviewed By: cachemeifyoucan

Pull Request: #166827
We recently moved over to compiling with clang-cl on Windows. This ended
up causing a large increase in warnings, particularly due to how
warnings are handled in nanobind. cd91d0f
initially set -Wall -Wextra and -Wpedantic while fixing another issue,
which is probably not what we want to do on third-party code. We also
need to disable -Wmissing-field-initializers to get things clean in this
configuration.

Reviewers: makslevental, jpienaar, rkayaith

Reviewed By: makslevental

Pull Request: #166828
We removed the limit a while back after moving to new infrastructure but
never removed the comment. Do that now to prevent confusion.
…166005)

This fixes two problems:
- dyld itself resides within the shared cache. MemoryMappingLayout
incorrectly computes the slide for dyld's segments, causing them to
(appear to) overlap with other modules. This can cause symbolication
issues.
- The MemoryMappingLayout ranges on Darwin are not disjoint due to the
fact that the LINKEDIT segments overlap for each module. We now ignore
these segments to ensure the mapping is disjoint.

This adds a check for disjointness, and a runtime warning if this is
ever violated (as that suggests issues in the sanitizer memory mapping).
There is now a test to ensure that these problems do not recur.
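A disjointness check of this kind can be sketched as below. This is an illustrative version under stated assumptions, not the sanitizer's code: `Range` and `rangesAreDisjoint` are hypothetical names, and ranges are treated as half-open `[start, end)` intervals.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [start, end).
struct Range {
  uint64_t start;
  uint64_t end;
};

// Returns true when no two ranges overlap. Sorting by start address means
// only neighbouring ranges need to be compared.
bool rangesAreDisjoint(std::vector<Range> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const Range &a, const Range &b) { return a.start < b.start; });
  for (std::size_t i = 1; i < ranges.size(); ++i) {
    if (ranges[i].start < ranges[i - 1].end)
      return false;
  }
  return true;
}
```

Ranges that merely touch, such as `[0, 10)` and `[10, 20)`, count as disjoint here; a segment shared by several modules, as described for LINKEDIT, would make the check fail unless it is filtered out first.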

rdar://163149325
…_v (#160607)

Implemented
[[*time.traits.is.clock*]](https://eel.is/c++draft/time.traits.is.clock)
from [P0355R7](https://wg21.link/p0355r7).

This patch implements the C++20 feature `is_clock` and `is_clock_v`
based on the documentation [on
cppreference](https://en.cppreference.com/w/cpp/chrono/is_clock.html)
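For readers unfamiliar with the trait, a detection-idiom approximation is sketched below. It is not the libc++ implementation from this patch: the `sketch` namespace is hypothetical, the code assumes C++17 for `std::void_t`, and it only checks that the nested types, `is_steady`, and `now()` exist.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

namespace sketch {

// Primary template: types are not clocks unless the specialization matches.
template <class T, class = void>
struct is_clock : std::false_type {};

// Matches when T has the nested types rep, period, duration and time_point,
// and when T::is_steady and T::now() are well-formed expressions.
template <class T>
struct is_clock<T, std::void_t<typename T::rep, typename T::period,
                               typename T::duration, typename T::time_point,
                               decltype(T::is_steady), decltype(T::now())>>
    : std::true_type {};

template <class T>
inline constexpr bool is_clock_v = is_clock<T>::value;

} // namespace sketch
```

Under these assumptions, `sketch::is_clock_v<std::chrono::steady_clock>` is true and `sketch::is_clock_v<int>` is false.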

Fixes #166049.
This PR adds the necessary infrastructure to enable testing of the
ACCImplicitData pass for FIR/HLFIR, along with comprehensive test
coverage for implicit data clause generation in OpenACC constructs.

New Infrastructure:
- Add FIROpenACCSupport analysis providing FIR-specific implementations
of OpenACCSupport interface methods for variable name extraction, recipe
name generation, and NYI emission
- Add FIROpenACCUtils with helper functions for:
  * Variable name extraction from FIR operations (getVariableName)
  * Recipe name generation with FIR type string representation
  * Bounds checking for constant array sections
- Add ACCInitializeFIRAnalyses pass to pre-register FIR analyses
(OpenACCSupport and AliasAnalysis) for use by subsequent OpenACC passes
in the pipeline

Refactoring in flang/lib/Lower/OpenACC.cpp:
- Move bounds string generation and bounds checking to FIROpenACCUtils
- Refactor recipe name generation to use fir::acc::getRecipeName

Test Coverage:
- acc-implicit-firstprivate.fir: Tests implicit firstprivate behavior
for scalar types (i8, i16, i32, i64, f32, f64, logical, complex) in
parallel/serial constructs with recipe generation verification
- acc-implicit-data.fir: Tests implicit data clauses for scalars,
arrays, derived types, and boxes in kernels/parallel/serial with
default(none) and default(present) variations
- acc-implicit-data-fortran.F90: Fortran tests verifying implicit data
generation through bbc with both HLFIR and FIR
- acc-implicit-data-derived-type-member.F90: Tests correct ordering of
parent/child data clause operations for derived type members
- acc-implicit-copy-reduction.fir: Tests enable-implicit-reduction-copy
flag controlling whether reduction variables use copy or firstprivate

This enables proper testing of implicit data clause generation through
the flang optimizer pipeline for OpenACC directives.
Main executables were bypassing the locate module callback that shared 
libraries use, preventing custom symbol file location logic from working
consistently. 

This PR fixes this by:
* Adding target context to ModuleSpec
* Leveraging that context to use the target search path and the platform's
locate module callback in ModuleList::GetSharedModule

This ensures both main executables and shared libraries get the same 
callback treatment for symbol file resolution.

---------

Co-authored-by: George Hu <[email protected]>
A global offset table is a section that holds the address of functions
that are dynamically linked. The Swift plugin needs to know if sections
are a global offset table or not.
Looks like #166517 is breaking the
libc-riscv32-qemu-yocto-fullbuild-dbg build due to a failing overflow test
for strfrom.
https://lab.llvm.org/buildbot/#/changes/58668

```
int result = func(buff, sizeof(buff), "%.2147483647f", 1.0f);
EXPECT_LT(result, 0);
ASSERT_ERRNO_FAILURE();
```

```
[ RUN      ] LlvmLibcStrfromdTest.CharsWrittenOverflow
/home/libcrv32buildbot/bbroot/libc-riscv32-qemu-yocto-fullbuild-dbg/llvm-project/libc/test/src/stdlib/StrfromTest.h:493: FAILURE
       Expected: result
       Which is: 0
To be less than: 0
       Which is: 0
/home/libcrv32buildbot/bbroot/libc-riscv32-qemu-yocto-fullbuild-dbg/llvm-project/libc/test/src/stdlib/StrfromTest.h:494: FAILURE
          Expected: 0
          Which is: 0
To be not equal to: static_cast<int>(libc_errno)
          Which is: 0
[  FAILED  ] LlvmLibcStrfromdTest.CharsWrittenOverflow
Ran 8 tests.  PASS: 7  FAIL: 1
```

At first glance it seems like there is some kind of overflow in
internal::strfromfloat_convert on 32bit archs because the other overflow
test case is passing for snprintf. Interestingly, it passes on all other
buildbots, including libc-arm32-qemu-debian-dbg.

This issue likely wasn't introduced by
#166517 and was probably
already present, so I'm not reverting the change, just disabling the test
case on riscv32 until I can debug it properly.
…de braced initializers (#166180)

Fixes #163498

---

This PR addresses the issue of confusing diagnostics for lambdas with
init-captures appearing inside braced initializers.

Cases such as:

```cpp
S s{[a(42), &] {}};
```

were misparsed as C99 array designators, producing unrelated
diagnostics, such as `use of undeclared identifier 'a'` and `expected
']'`.

---


https://github.com/llvm/llvm-project/blob/bb9bd5f263226840194b28457ddf9861986db51f/clang/lib/Parse/ParseInit.cpp#L470


https://github.com/llvm/llvm-project/blob/bb9bd5f263226840194b28457ddf9861986db51f/clang/lib/Parse/ParseInit.cpp#L74


https://github.com/llvm/llvm-project/blob/bb9bd5f263226840194b28457ddf9861986db51f/clang/include/clang/Parse/Parser.h#L4652-L4655


https://github.com/llvm/llvm-project/blob/24c22b7de620669aed9da28de323309c44a58244/clang/lib/Parse/ParseExprCXX.cpp#L871-L879

The tentative parser now returns `Incomplete` for partially valid lambda
introducers, preserving the `lambda` interpretation and allowing the
proper diagnostic to be issued later.

---

Clang now correctly recognizes such constructs as malformed lambda
introducers and emits the expected diagnostic — for example,
“capture-default must be first” — consistent with direct initialization
cases such as:

```cpp
S s([a(42), &] {});
```
The low hanging fruit that was causing the vast majority of these
warnings has been fixed, so reenable them now. There are still a couple
more warnings that could probably do with some cleanup, but those can be
fixed in the future.
This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.

Fix BR_CC/MEMCPY descriptions to match C++ code that creates the nodes
(an error detected by the enabled verification functionality).

Also remove redundant `SDNPOutGlue` on `BPFISD::MEMCPY`.

Part of #119709.
…162822)

Check that all partial reductions in a chain are only used by other
partial reductions with the same scale factor. Otherwise we end up
creating users of scaled reductions where the types of the other
operands don't match.

A similar issue was addressed in
#158603, but misses the chained
cases.

Fixes #162530.

PR: #162822
…tions (#166776)

Seeing warnings:

llvm/include/llvm/CodeGen/LibcallLoweringInfo.h:15:46: error:
'visibility' attribute ignored [-Werror=attributes]
15 |   LLVM_ABI const RTLIB::RuntimeLibcallsInfo &RTLCI;
llvm/include/llvm/CodeGen/LibcallLoweringInfo.h:18:25: error:
'visibility' attribute ignored [-Werror=attributes]
18 |       RTLIB::Unsupported};
Fix shared library linking failure for FIROpenACCTransforms
…#166674)

Remove the unnecessary sleep in MachProcess::AttachForDebug. The
preceding comment makes it seem like it's necessary for synchronization,
though I don't believe that's the case (see below), and even if it were,
sleeping is not a reliable way to achieve that.

The reason I don't believe it's necessary is that after we return, we
synchronize with the exception thread on a state change. The latter will
call and update the process state, which is exactly what we synchronize
on. I was able to verify that this is the first time we change the
process state: i.e., `GetState` doesn't return a different value before
and after the sleep.

On top of that, there are 3 more places where we call ptrace attach
(`PosixSpawnChildForPTraceDebugging`, `SBLaunchForDebug`, and
`BoardServiceLaunchForDebug`) where we don't sleep.

rdar://163952037